A Unified Framework and Sequential Data Cleaning Approach for a Data Warehouse

Authors

  • J. Jebamalar Tamilselvi
  • V. Saravanan
Abstract

Data cleaning is the process of identifying and removing errors in a data warehouse, and it is an important step in the data mining process. Most organizations need high-quality data, so the quality of the data in the warehouse must be improved before mining begins. The frameworks available for data cleaning offer fundamental services such as attribute selection, token formation, and the selection of a clustering algorithm, a similarity function, an elimination function and a merge function. This paper proposes a new framework for data cleaning and presents a solution that carries out the cleaning process by applying the framework's steps in sequential order.
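The sequence of services named in the abstract can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the authors' framework: the function names, the first-token blocking key, the Jaccard similarity, the 0.5 threshold and the "keep the longest value" merge rule are all hypothetical choices made for illustration.

```python
from itertools import combinations


def select_attributes(records, attributes):
    """Step 1 (assumed): keep only the attributes chosen for cleaning."""
    return [{a: r.get(a, "") for a in attributes} for r in records]


def form_tokens(record):
    """Step 2 (assumed): build a lower-cased token set from the values."""
    tokens = set()
    for value in record.values():
        text = str(value).lower().replace(",", " ").replace(".", " ")
        tokens.update(text.split())
    return frozenset(tokens)


def cluster_key(tokens):
    """Step 3 (assumed): crude blocking key, the alphabetically first token."""
    return min(tokens) if tokens else ""


def jaccard(a, b):
    """Step 4 (assumed): Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge(group):
    """Step 6 (assumed): merge duplicates, keeping the longest value per field."""
    merged = dict(group[0])
    for rec in group[1:]:
        for key, value in rec.items():
            if len(str(value)) > len(str(merged.get(key, ""))):
                merged[key] = value
    return merged


def clean(records, attributes, threshold=0.5):
    """Run the steps in sequential order and return de-duplicated records."""
    selected = select_attributes(records, attributes)
    tokens = [form_tokens(r) for r in selected]

    # Cluster: only records sharing a blocking key are compared.
    blocks = {}
    for i, t in enumerate(tokens):
        blocks.setdefault(cluster_key(t), []).append(i)

    # Step 5 (assumed): eliminate duplicates by linking similar records.
    parent = list(range(len(records)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for indexes in blocks.values():
        for i, j in combinations(indexes, 2):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec)
    return [merge(g) for g in groups.values()]
```

Calling `clean(records, ["name", "city"])` on a list of dictionaries returns one merged record per group of near-duplicates. Keeping each step as a separate function mirrors the idea of selecting a clustering, similarity, elimination and merge function independently, so any one of them can be swapped without touching the others.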


Similar resources

Uclean: a Requirement Based Object-Oriented ETL Framework

A data warehouse is used to provide effective results from multidimensional data analysis. The accuracy and correctness of these results depend on the quality of the data. To improve data quality, data must be properly extracted, transformed and loaded into the data warehouse. This ETL process is the key to the success of a data warehouse. In this paper we propose a conceptual ETL framework for a...


Fuzzy multi-criteria selection procedures in choosing data source

Technology assessment and selection has a substantial impact on organizations' procedures with regard to technology transfer. Technological decisions are usually made by a group of experts, and integrating these viewpoints into a single decision can be quite complex. Today, operational databases and data warehouses exist to manage and organize data with specific features and henceforth, th...


XML based Framework for ETL Processes For Relational Databases

In data warehousing, Extraction-Transformation-Loading (ETL) tasks are the key tasks responsible for the extraction of data from several sources and their cleansing, customization and insertion into the data warehouse [10]. More specifically, ETL tools are a category of specialized tools that deal with data warehouse cleaning and loading problems. These tasks are very critical in every d...


Data Mapping Diagrams for Data Warehouse Design with UML

In Data Warehouse (DW) scenarios, ETL (Extraction, Transformation, Loading) processes are responsible for the extraction of data from heterogeneous operational data sources, their transformation (conversion, cleaning, normalization, etc.) and their loading into the DW. In this paper, we present a framework for the design of the DW back-stage (and the respective ETL processes) based on the key ob...


An Integrated Model for Assessing Organizational Readiness to Implement a Data Warehouse System Using the Analytic Hierarchy Process

An Enterprise Data Warehouse initiative is a high-investment project. The adoption of a Data Warehouse will differ significantly depending on the organization's level of readiness. Before a Data Warehouse system is implemented in a firm, the firm's level of readiness must be evaluated. A successful Data Warehouse assessment model requires a deep understanding of opportuni...




Publication date: 2008